CRAWLING THE COSMIC NETWORK: IDENTIFYING AND QUANTIFYING FILAMENTARY STRUCTURE
ABSTRACT

We present the Smoothed Hessian Major Axis Filament Finder (SHMAFF), an algorithm that
uses the eigenvectors of the Hessian matrix of the smoothed galaxy distribution to identify individual
filamentary structures. Filaments are traced along the Hessian eigenvector corresponding to the largest
eigenvalue, and are stopped when the axis orientation changes more rapidly than a preset threshold.
In both N-body simulations and the Sloan Digital Sky Survey (SDSS) main galaxy redshift survey
data, the resulting filament length distributions are approximately exponential. In the SDSS galaxy
distribution, using smoothing lengths of 10 $h^{-1}$ Mpc and 15 $h^{-1}$ Mpc, we find filament lengths per
unit volume of $1.9 \times 10^{-3}$ $h^2$ Mpc$^{-2}$ and $7.6 \times 10^{-4}$ $h^2$ Mpc$^{-2}$, respectively. The filament width
distributions, which are much more sensitive to non-linear growth, are also consistent between the
real and mock galaxy distributions using a standard cosmology. In SDSS, we find mean filament
widths of 5.5 $h^{-1}$ Mpc and 8.4 $h^{-1}$ Mpc on 10 $h^{-1}$ Mpc and 15 $h^{-1}$ Mpc smoothing scales, with
standard deviations of 1.1 $h^{-1}$ Mpc and 1.4 $h^{-1}$ Mpc, respectively. Finally, the spatial distribution of
filamentary structure in simulations is very similar between $z = 3$ and $z = 0$ on smoothing scales as
large as 15 $h^{-1}$ Mpc, suggesting that the outline of filamentary structure is already in place at high
redshift.

Subject headings: cosmology: observations — cosmology: large-scale structure of universe - cosmology: 
theory 



1. INTRODUCTION 



Observational evidence for filamentary structures in
the large-scale distribution of galaxies was first presented
in galaxy redshift surveys (e.g. Thompson & Gregory
1978; Davis et al. 1981; de Lapparent et al. 1986;
Sathyaprakash et al. 1998; Colless et al. 2001; Gott et al.
2005). When similar structures were seen in cosmological
N-body simulations of the dark matter distribution
(e.g. Bond et al. 1996; Sathyaprakash et al. 1996;
Aragon-Calvo et al. 2007; Hahn et al. 2007a), a picture
of a vast 'cosmic web,' in which filaments skirted the
boundaries of voids and were connected by galaxy
clusters, began to emerge. These filaments are thought
to provide pathways for matter to accrete onto galaxy
clusters (e.g. Tanaka et al. 2007) and to torque dark
matter halos to align their spin axes (Hahn et al.
2007a,b, 2009). Filaments also produce deep potential
wells and will give rise to a gravitational lensing signal
on the largest scales (Dietrich et al. 2005; Massey et al.
2007). A number of authors have claimed detections
of filaments using weak lensing (e.g. Kaiser et al. 1998;
Dietrich et al. 2005; Massey et al. 2007), but simulations
predict that structure along the line of sight should
produce shear comparable to that of the target filaments
(Dolag et al. 2006) and the evidence remains far from
conclusive. In addition, the formation of filaments is
accompanied by gravitational heating, which gradually
raises the temperature of the intergalactic medium over
time and produces the so-called warm-hot intergalactic
medium (WHIM) by $z = 0$ (e.g. Cen & Ostriker 1999).

Perhaps the simplest and most effective means of identifying
clusters in discretely sampled fields, such as redshift
surveys and N-body simulations, is the friends-of-friends
algorithm (FOF; Huchra & Geller 1982), in which
particle groups are assembled based on the separation of
nearest neighbors. These FOF structures can then be
quantified with 'Shapefinders,' statistics which measure
the length, breadth, and thickness of structures and are
related to the Minkowski functionals (Sahni et al. 1998).
Sheth et al. (2003) have developed an algorithm for computing
the Shapefinders on structures at an arbitrary
density threshold. Many of those found in data and
simulations are indeed filamentary, but FOF algorithms
are optimized for structures that lie above a set density
threshold, a condition approximately met by clusters at
the present epoch. Filaments and walls, however, are not
bound and a strict density cut alone would not provide
clean samples of such structures.

^ nbond@physics.rutgers.edu

Another algorithm, called the Skeleton (Novikov et al.
2006; Sousbie et al. 2008a,b), identifies filaments by
searching for saddle points in a density field and then
following the density gradient along the filament until
it reaches a local maximum. Although it appears
to be effective at making an outline of the cosmic network,
it lacks an intuitive definition of filament ends.
The method of Aragon-Calvo et al. (2008) also lacks such
a definition, but has been successful at tracing the filament
network in cosmological simulations using watershed segmentation
(see also Platen et al. 2007) and a Delaunay tessellation
density estimator (Schaap & van de Weygaert 2000). If
we wish to analyze filament length distributions or their
spatial relationship to clusters, it is important to separate
individual filaments in the cosmic web. Structure-finding
techniques that only detect filaments between
galaxy cluster pairs (e.g. Pimbblet 2005; Colberg et al.
2009) would present a biased
view of the filament-cluster relationship.

An early technique for identifying filaments in two-dimensional
data was developed by Moody et al. (1983)
that works on a similar principle to the algorithm described
in this paper. It divides the density field into
a pixelized grid and identifies as filament elements any
grid cell that has a larger density than its immediate
neighbors along two of the four axes (including the two
coordinate axes and two axes at 45° angles to the grid)
through the grid cell. The algorithm was run on the
Shane-Wirtanen galaxy count catalogue (Seldner et al.
1977), but has not been developed further. A later algorithm,
presented by Dave et al. (1997), works on a similar
principle, identifying "linked sequences" using the
eigenvectors of the inertia tensor. The authors found
that the algorithm was poor at discriminating between
cosmological models using CfA1-like mock galaxy catalogs,
but primarily because of the small number of galaxies
in the catalogs.

In Paper 1, we used the distribution of the Hessian
eigenvalues of the smoothed density field ($\lambda$-space) on a
grid to study three types of structure: clumps, filaments,
and walls. Filaments were found in the $\lambda$-space distributions
at a variety of smoothing scales, ranging at least
from 5 to 15 $h^{-1}$ Mpc, in both N-body simulations and
the galaxy distribution measured by the Sloan Digital
Sky Survey (SDSS; York et al. 2000). Furthermore, filaments
were found to dominate the large-scale distribution
of matter using smoothing scales of 10 to 15 $h^{-1}$ Mpc,
giving way to clumps with ~5 $h^{-1}$ Mpc smoothing.

The fact that the eigenvalues of the Hessian can
be used to discriminate different types of structure
in a particle distribution is fundamental to a number
of structure-finding algorithms (e.g., Colombi et al.
2000; Hahn et al. 2007a; Aragon-Calvo et al. 2007;
Forero-Romero et al. 2009). However, the relationship
between $\lambda$-space and a particular structure is not always
trivial. For example, one might think that a filamentary
grid cell would have two negative and one positive
eigenvalue. This will be true near the centre of a filament
connecting two overdense filament ends, but in
the vicinity of the overdensities or in the case that the
filament ends at an underdensity, all three eigenvalues
will become negative. In addition, when working with a
smoothed density field, these criteria select regions that
are near clumps and do not necessarily lie along the filament.
Finally, these criteria disregard the structure's
width: for example, the regions away from the centre of
the filament may have positive values of $\lambda_2$.

In this paper, we will describe a procedure to identify
filaments in the three-dimensional galaxy distribution
using an algorithm called the Smoothed Hessian
Major Axis Filament Finder (SHMAFF), and compare
their properties in cosmological N-body simulations to
those in the SDSS galaxy redshift survey. We describe
our methodology, which uses the eigenvalues and eigenvectors
of the smoothed Hessian matrix (see Bond et al.
2009, hereafter Paper 1), in § 2. In § 3, we run the
code with a range of possible input parameters and justify
our choices for each. We discuss the behavior of
the algorithm when used on Gaussian random fields in
§ 4, allowing us to distinguish those features of the large-scale
distribution of matter that are a direct consequence
of the non-linear growth of structure. In § 5, we use
mock galaxy catalogues to estimate the incompleteness
and contamination rates of filament samples and then
use these quantities to interpret the distribution of filaments
found in the SDSS (§ 6). In § 7, we summarize our
results and discuss the implications of our findings.

2. FINDING INDIVIDUAL FILAMENTS 

Filaments, clusters, and walls all present sharp features
in the density field along at least one of their principal
axes. In Paper 1, we described a procedure to generate
a matrix of Gaussian-smoothed second derivatives
of the density field (the Hessian matrix) at each grid
cell, computing its eigenvalues, $\lambda_i$ (defined such that
$\lambda_1 < \lambda_2 < \lambda_3$), and eigenvectors, $\mathbf{A}_i$. For the testing
and development of the algorithm, we ran a series
of cosmological N-body simulations, using a particle-mesh
code with $\Omega_m = 0.29$, $\Omega_\Lambda = 0.71$, $\sigma_8 = 0.85$,
and $h = H_0/(100$ km s$^{-1}$ Mpc$^{-1}) = 0.69$ (see Paper
1 for details). The simulation is performed within a
200 $h^{-1}$ Mpc box with $512^3$ particles, each with mass
$m_p = 4.77 \times 10^{9}$ $h^{-1}$ $M_\odot$.
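
As a consistency check on the quoted particle mass, the arithmetic can be sketched as below. The critical density, $2.775 \times 10^{11}$ $h^2$ $M_\odot$ Mpc$^{-3}$, is a standard constant that is not stated in the text, and the function name is illustrative. With these inputs the result comes out within about one per cent of $4.77 \times 10^{9}$ $h^{-1}$ $M_\odot$.

```python
# Sketch: particle mass implied by the box size, particle count, and Omega_m.
# RHO_CRIT is the standard critical density in h^2 Msun Mpc^-3; it is an
# assumption brought in from outside the text. Lengths are in h^-1 Mpc.
RHO_CRIT = 2.775e11


def particle_mass(omega_m, box_size, n_per_side):
    """Return the mass per particle in h^-1 Msun for a cubic box."""
    total_mass = omega_m * RHO_CRIT * box_size ** 3
    return total_mass / n_per_side ** 3


m_p = particle_mass(omega_m=0.29, box_size=200.0, n_per_side=512)
print(f"m_p = {m_p:.3e} h^-1 Msun")
```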

In order to generate a three-dimensional distribution
of mock galaxies, we first identify dark matter halos
within the particle distribution using the HOP algorithm
(Eisenstein & Hut 1998) and then populate them using
the halo occupation distribution and parametrization of
Zheng et al. (2007; see Paper 1 for details). The resulting
mock galaxy distribution is smoothed using a
Gaussian kernel and its second derivatives, yielding a
128 × 128 × 128 grid with Hessian eigenvalues and eigenvectors
in each cell. In Fig. 1, we plot a slice from the
simulation 10 $h^{-1}$ Mpc deep and 27.21 $h^{-1}$ Mpc on a
side, chosen to encompass a prominent filamentary structure.
Shown are the galaxies (upper left), galaxy density
map (upper right), and $\lambda_1$ map (lower left and right),
smoothed with an $l = 2$ $h^{-1}$ Mpc kernel to bring out the
filament. The structure appears most clearly in $\lambda_1$, so
we construct a list of grid cells, G, ordered by increasing
value of $\lambda_1$. Before marking the first filament, we remove
from G all grid cells that satisfy any of the following criteria,



$$\lambda_1 > 0, \quad \lambda_2 > 0, \quad \rho < \bar{\rho}, \tag{1}$$



where $\bar{\rho}$ is the mean density of objects making up the
density field. The $\lambda_1$ and $\lambda_2$ thresholds follow from the
definition of a filament: the density field must be concave
down along at least two of the principal axes.
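
The construction of the ordered list G can be sketched as follows. This is a minimal illustration, assuming the thresholds on $\lambda_1$ and $\lambda_2$ are zero (as the concavity argument implies); the function name, the tuple layout, and the sample cells are hypothetical.

```python
def build_candidate_list(cells, rho_mean):
    """Return the cells that survive Equation 1, most negative lambda_1 first.

    Each cell is a tuple (cell_id, lam1, lam2, rho). The zero thresholds on
    lam1 and lam2 are an assumption based on the concavity argument.
    """
    kept = [
        cell for cell in cells
        if not (cell[1] > 0.0 or cell[2] > 0.0 or cell[3] < rho_mean)
    ]
    return sorted(kept, key=lambda cell: cell[1])


# Hypothetical cells: (id, lambda_1, lambda_2, density)
cells = [
    ("a", -3.0, -1.0, 2.0),
    ("b", -5.0, 0.5, 3.0),   # removed: lambda_2 > 0
    ("c", -4.0, -2.0, 0.5),  # removed: density below the mean
    ("d", -6.0, -0.1, 1.5),
]
G = build_candidate_list(cells, rho_mean=1.0)
print([cell[0] for cell in G])
```

The first entry of the returned list plays the role of the starting point for the first filament.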

The first element in G (the most negative in $\lambda_1$) is
marked with a cross in the lower left panel of Fig. 1.
From this starting point, we trace out the filament in
both directions of the 'axis of structure' (parallel and
antiparallel to $\mathbf{A}_3$), taking steps equal to the grid scale of
1.5625 $h^{-1}$ Mpc. Subsequent filament elements are not
constrained to lie on the grid, so we use a third-order
polynomial interpolation scheme (Press et al. 1986) on
the grid to obtain the local Hessian parameters. If, at any
point along the filament, the angular rate of change of the
axis of structure exceeds a threshold, C, we stop tracing
and mark the point as a filament end. The stopping
Fig. 1. Slice from a cosmological simulation 10 $h^{-1}$ Mpc deep
and 27.21 $h^{-1}$ Mpc on a side, encompassing a prominent filamentary
structure. Shown are the galaxies (upper left), density map
(upper right), and $\lambda_1$ map (lower two panels), where smoothing is
performed on a scale of $l = 2$ $h^{-1}$ Mpc. The cross in the lower
left panel indicates the minimum value of $\lambda_1$ on the slice. This
will be the starting point for the first filament traced by the algorithm.
The circle around the cross has a radius equal to the
removal width (see text). The solid lines in the lower right panel
indicate the filaments as traced by a 2D version of SHMAFF.



condition at step m is given by,

$$|\mathbf{A}_{3,m} \times \mathbf{A}_{3,m-1}| > \sin(C\Delta), \tag{2}$$

where $\Delta$, the grid cell size, is also the size of each step.
The filament finder will also stop and mark a filament
end if it passes into a cell that satisfies one or more of
the criteria specified in Equation 1. In the lower right
panel of Fig. 1, we show the filaments that result from
a sample run of SHMAFF on a 27 × 27 × 10 $h^{-3}$ Mpc$^3$
slice from the dark matter particle distribution in our
cosmological simulation.
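
The curvature test of Equation 2 can be sketched as follows. This is an illustration rather than the SHMAFF implementation: it assumes C has been converted to degrees per $h^{-1}$ Mpc (for example, 30° per smoothing length with $l = 5$ $h^{-1}$ Mpc gives 6° per $h^{-1}$ Mpc), and the helper names are hypothetical.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def should_stop(axis_prev, axis_curr, c_deg_per_mpc, step):
    """Equation 2: stop when |A_curr x A_prev| > sin(C * step).

    Both axes are unit vectors. The cross-product magnitude is unchanged
    if either axis is flipped, so the arbitrary sign does not matter.
    """
    turn = math.sqrt(sum(x * x for x in cross(axis_curr, axis_prev)))
    return turn > math.sin(math.radians(c_deg_per_mpc * step))


def unit_axis(angle_deg):
    """Hypothetical helper: unit vector in the x-y plane."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a), 0.0)


# Assumed conversion: C = 30 degrees per smoothing length, l = 5 h^-1 Mpc.
c = 30.0 / 5.0
step = 1.5625  # grid scale in h^-1 Mpc
print(should_stop(unit_axis(0.0), unit_axis(5.0), c, step))
print(should_stop(unit_axis(0.0), unit_axis(15.0), c, step))
```

With these numbers the allowed turn per step is 9.375°, so a 5° change between consecutive steps continues the trace while a 15° change ends the filament.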

For each step along a filament, all grid cells within a 
removal width, W, of the most recently chosen filament 
element, are removed from G, where 



W,^K0 (3, 

In order to avoid tracing a filament more than once, subsequent
filaments cannot start within one of the removed
cells. They may, however, extend into a removed cell, so
long as the cell is not excluded by any of the criteria given
in Equations 1 and 2. For a cylindrical filament with
a Gaussian cross section extending into a zero-density
background, a value of $K = 1$ should exclude those parts
of the structure that are not already excluded by Equation 1.

The filaments traced by the above algorithm may be 
offset from the ridges in the initial point field because 
of the finite resolution of the grid. Thus, we adjust the 
position of a filament element, j, based on the average 
perpendicular displacement of nearby grid cells from the 



Fig. 2. Filaments found in the $z = 0$ dark matter distribution
of a cosmological simulation are plotted (in red) over a subsample
of dark matter particles (black). Filaments are found in a density
field smoothed with the kernel length indicated below each box.
For $l = 5$ $h^{-1}$ Mpc, we only show an octant of the full simulation
box, as the full filament distribution is so rich that the figure for
the full box would be too crowded.

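filament axis,

$$\Delta\mathbf{x}_j = \frac{1}{N}\sum_{i=1}^{N} \mathbf{R}_i, \tag{4}$$

where,

$$\mathbf{R}_i = \mathbf{A}_j \times (\mathbf{A}_j \times (\mathbf{x}_j - \mathbf{x}_i)). \tag{5}$$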

Here, Aj is the unit vector along the axis of structure 
(with arbitrary sign) and N is the number of objects in 
the initial point field that are within a smoothing length. 
Application of the centering algorithm can result in frag- 
mented filaments when shot noise is non-negligible, so 
we will not run it on point distributions with very sparse 
sampling, such as the SDSS galaxy distribution. 
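
The centering step can be sketched as follows. The double cross product in Equation 5 returns the component of $(\mathbf{x}_i - \mathbf{x}_j)$ perpendicular to the axis of structure. The sketch assumes the element is shifted by the plain mean of these offsets and that the caller has already selected the points within a smoothing length; the function names are hypothetical.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def perpendicular_offset(axis, x_elem, x_point):
    """Equation 5: R_i = A_j x (A_j x (x_j - x_i)) for a unit axis A_j."""
    d = tuple(a - b for a, b in zip(x_elem, x_point))
    return cross(axis, cross(axis, d))


def centred_position(axis, x_elem, points):
    """Shift x_elem by the mean perpendicular offset (assumed form)."""
    if not points:
        return tuple(x_elem)
    offsets = [perpendicular_offset(axis, x_elem, p) for p in points]
    mean = tuple(sum(o[k] for o in offsets) / len(offsets) for k in range(3))
    return tuple(x + m for x, m in zip(x_elem, mean))


# Hypothetical element on the z axis with two nearby points.
axis = (0.0, 0.0, 1.0)
new_pos = centred_position(axis, (0.0, 0.0, 0.0),
                           [(1.0, 0.0, 5.0), (1.0, 2.0, -3.0)])
print(new_pos)
```

Only the components perpendicular to the axis move, so the element slides sideways onto the local ridge without changing its position along the filament.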

3. FILAMENT-FINDING PARAMETERS 

In the filament-finding routine described in § 2, there
are two free parameters, the curvature criterion for identifying
the filament ends, C, and the width of filament removal,
K (see Equation 3). In principle, the optimal values
of these parameters can be functions of the smoothing
scale, redshift, or type of tracer (e.g., galaxies, dark
matter particles), so it is important to understand their
impact on the algorithm's output. In this section, we
will test the performance of the code on the distribution
of dark matter particles in our cosmological simulation
as a function of K, C, and the sampling rate.

In Fig. 2, we show a run of the filament finder on the
$z = 0$ dark matter distribution of the cosmological simulation,
using C = 40° and K = 1. Output is shown
for smoothing with $l$ = 15, 10, and 5 $h^{-1}$ Mpc, illustrating
the scale-dependence and coherence of the cosmic
network. Any given filament will be found on a range
of scales, depending on its width and length, but as the
smoothing length is made smaller, the filament will be
broken up into substructures which will themselves be
Fig. 3. Length distribution for a sample of filaments found in
the $z = 0$ smoothed dark matter distribution (smoothing kernel
width of $l = 5$ $h^{-1}$ Mpc). The dashed line indicates the smoothing
length, below which filaments are removed from the sample.

filamentary or clump-like (see figures 15 and 16 in Pa- 
per 1). 

3.1. Sampling the filament length distributions 

Before defining the parameters that are used to find
filaments, we must decide on what we are willing to accept
as a real filament. An isolated spherical overdensity
should not be viewed as a filament, but the filament
finder would treat it as a very short ridge, tracing it
from its centre until random fluctuations caused the axis
of structure to deviate more than C, producing a 'short
filament'. Fig. 3 shows the raw distribution of filament
lengths for our dark matter simulation shown in Fig. 2,
using $C = 30^\circ\,l^{-1}$, $K = 1$, and a smoothing length of
5 $h^{-1}$ Mpc. Not surprisingly, the distribution exhibits a
dramatic drop-off below a smoothing length. With this
in mind, we hereafter discard filaments whose lengths are
shorter than the smoothing length as non-physical.

3.2. The C parameter 

The traditional picture of large-scale structure as
a 'cosmic web' (Bond et al. 1996) suggests that filaments
are connected, one-dimensional strands that end
abruptly at their points of intersection. As one filament
begins and another ends, the local axis of structure
should change direction rapidly. The C parameter denotes
the maximum angular rate of change in the axis of
structure along a filament. If this threshold is exceeded,
filament tracing is stopped.

In order to test the sensitivity of the output filaments
to the value of the C parameter, we set $K = 1$ and generated
filament networks in the N-body simulation with
a range of C. In all of these tests, increasing the value
of C led to an increase in the average length of the filaments
and a decrease in the total number of filaments
found. If the curvature criterion is not strict enough, a
filament will be traced past its vertex and into another filament.
Since our algorithm only prevents filaments from
Fig. 4. Total length of the output filament network (solid lines)
and the fraction of 'repeat detections' (dashed lines) as a function
of the curvature criterion, C (left), and K (right). The value of C
determines the filament ending points and the value of K determines
the distance from filaments that pixels are removed from
further consideration by the filament finder. Filaments were identified
on three different smoothing scales, $l = 15$ $h^{-1}$ Mpc (blue),
$l = 10$ $h^{-1}$ Mpc (green), and $l = 5$ $h^{-1}$ Mpc (red). The total length
of the network (after removing repeat detections) maximizes at a
value of C that depends on smoothing length.

starting within previously-identified filaments (they are
allowed to cross one another), this can lead to double
detections of filaments. We can obtain a rough count of
these double detections by comparing filament elements
to one another, where a filament element is defined as a
single step (of interval $\Delta$) on the grid. In other words,
for each step along a given filament, we find the closest
filament element that is not a member of that same filament.
If the closest filament element is within a smoothing
length and has an axis of structure within C, then
the original element is labelled a 'repeat detection.' The
total number of repeat detections in an output filament
network is denoted by R. The total length of the network
at this scale is therefore given by



$$L_f = (N_e - R)\Delta, \tag{6}$$

where $N_e$ is the total number of filament elements found
and $\Delta$ is the step size taken by the filament finder. Non-filamentary
regions of space have already been excluded
by the criteria in Equation 1, so an optimum set of parameters
will maximize $L_f$ while minimizing R.
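
The repeat-detection count and the resulting network length can be sketched as follows. This is a brute-force illustration, not the SHMAFF code: the angular tolerance is passed in as a plain angle (`max_angle_deg`, an assumed parameter), axes are compared up to sign, and the element tuples are hypothetical.

```python
import math


def count_repeats(elements, smoothing_length, max_angle_deg):
    """Count elements whose nearest neighbour on another filament is close
    and aligned. Each element is (filament_id, position, unit_axis)."""
    cos_tol = math.cos(math.radians(max_angle_deg))
    repeats = 0
    for fid, pos, axis in elements:
        others = [(math.dist(pos, p), a) for f, p, a in elements if f != fid]
        if not others:
            continue
        distance, other_axis = min(others, key=lambda item: item[0])
        # Axes have arbitrary sign, so compare |cos(angle)|.
        aligned = abs(sum(x * y for x, y in zip(axis, other_axis))) >= cos_tol
        if distance <= smoothing_length and aligned:
            repeats += 1
    return repeats


def network_length(n_elements, repeats, step):
    """Network length: (number of elements - repeats) * step size."""
    return (n_elements - repeats) * step


# Hypothetical network: filaments 0 and 1 run side by side, 2 is isolated.
elements = [
    (0, (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
    (0, (1.5, 0.0, 0.0), (1.0, 0.0, 0.0)),
    (1, (0.0, 1.0, 0.0), (1.0, 0.0, 0.0)),
    (2, (50.0, 50.0, 50.0), (0.0, 1.0, 0.0)),
]
R = count_repeats(elements, smoothing_length=5.0, max_angle_deg=30.0)
print(R, network_length(len(elements), R, step=1.5625))
```

Both members of an overlapping pair are flagged here, which matches the description of the count as a rough one.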

In the left panel of Fig. 4, we plot both the fraction
of repeat detections ($R/N_e$, dashed lines) and the total
length of the network ($L_f$, solid lines) as a function of
C. On all smoothing scales, the fraction of false positives
rises steadily with increasing C, with no obvious breaks
or minima. The total length, however, tends to rise until
it reaches a maximum, after which point it either flattens
or falls slowly. This suggests that, as long as the curvature
criterion is above a critical value, the algorithm will
trace out the entire filament network. Since the fraction
of false positives rises with C, we will hereafter use a
curvature criterion near this value; that is, C = 50, 40,
and 30° $l^{-1}$ for $l$ = 15, 10, and 5 $h^{-1}$ Mpc, respectively.
3.3. The K parameter

As each filament is found, we wish to remove from the
grid as much of it as possible without preventing the detection
of further real filaments. Using the previously-determined
critical values of C, we ran the filament-finder
with a range of K and computed the total length
of the filament network and the fraction of repeat detections
as a function of K. The results are shown in the
right panel of Fig. 4. All of the curves are monotonic,
with repeat detections and the network length decreasing
with increasing K. Hereafter, we will set $K = 1$ because
it yields $R/N_e < 20$ per cent.

3.4. Effects of sparse sampling 

In real galaxy catalogues, the number of galaxies per
smoothing volume will sometimes be small and it is important
to understand the impact of shot noise on the
algorithm's ability to trace the filament network. In a
density field with sparse sampling, shot noise will create
spurious filament detections in addition to the 'repeat
detections' described in § 3.2. We have an effectively
shot-noise-free density field in the dark matter particle
distribution (with the simulation using a mean dark matter
particle density of 17 particles $h^3$ Mpc$^{-3}$), so we perform
sparse sampling on this field and use the complete
particle distribution as a standard for comparison. We
construct three such data sets, sampled to densities of
$5 \times 10^{-3}$, $2 \times 10^{-3}$, and $1 \times 10^{-3}$ particles $h^3$ Mpc$^{-3}$,
matching the densities of the real galaxy samples to be
presented in § 6. For each sample, we recompute the
SHMAFF parameters and run the filament finder on all
three smoothing scales, using the parameters derived in
previous sections. We will call a 'false positive' any filament
element found in the sparsely sampled data whose
nearest neighboring element in the 'true' filament network
is more than a smoothing length away or does
not have an axis of structure within an angle equal to
$C \times l$. Similarly, incompleteness is quantified by counting
the filament elements in the 'true' network that have
no counterparts in the sparse-sampled one.

As illustrated in Fig. 5, the incompleteness and contamination
rates of individual filament elements are
strong functions of $\lambda_1$ for the 'weakest' edges, but these
make up only a small fraction of the filament network.
Our tests (not shown) suggest that the incompleteness
and contamination rates are < 20 per cent so long as
there are an average of > 5 particles within spheres of
radius equal to the Gaussian smoothing length. See § 5
for a more detailed analysis of completeness and contamination
in mock galaxy samples.
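
The sampling criterion can be checked with a short calculation. The sketch below assumes the 'smoothing volume' is a sphere of radius equal to the smoothing length and takes the sparsest-tested galaxy-like density to be $5 \times 10^{-3}$ $h^3$ Mpc$^{-3}$, which reproduces the 2.6 particles per smoothing volume quoted in the caption of Fig. 5 for 5 $h^{-1}$ Mpc smoothing. The function name is illustrative.

```python
import math


def particles_per_smoothing_sphere(number_density, smoothing_length):
    """Mean count inside a sphere of radius equal to the smoothing length.

    number_density is in h^3 Mpc^-3 and smoothing_length in h^-1 Mpc.
    """
    volume = 4.0 / 3.0 * math.pi * smoothing_length ** 3
    return number_density * volume


n_sparse = particles_per_smoothing_sphere(5.0e-3, 5.0)
for length in (5.0, 10.0, 15.0):
    count = particles_per_smoothing_sphere(5.0e-3, length)
    print(f"l = {length:4.1f} h^-1 Mpc: {count:6.1f} particles, "
          f"meets threshold: {count >= 5.0}")
```

At this density the 5 $h^{-1}$ Mpc scale falls below the five-particle guideline, while the 10 and 15 $h^{-1}$ Mpc scales sit comfortably above it.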

4. FILAMENTS AS NON-GAUSSIANITIES 

Gaussian random fields serve as an important reference
point if we wish to distinguish the consequences
of the non-linear growth of structure from phenomena
seen only in the linear regime. We know that Gaussian
random fields are not filamentary and one might
question why we should find any filaments in such a distribution.
Bear in mind, however, that the SHMAFF
algorithm traces any negatively curved region and these
conditions will certainly be met by some of the overdensities
in a Gaussian random field. We demonstrated in
Paper 1 that although the smoothed $\lambda_1$ distributions appeared
'filamentary' in both a Gaussian random field and
Fig. 5. Incompleteness and contamination of filament pixels
as a function of $\lambda_1$ (scaled to the mean density and smoothing
length) in the filament elements of a dark matter particle distribution
sparse-sampled to match the density of $M_r < -20$ galaxies
($5.0 \times 10^{-3}$ $h^3$ Mpc$^{-3}$). The sparse-sampled field was smoothed on
a scale of 5 $h^{-1}$ Mpc, yielding 2.6 particles per smoothing volume,
and its filaments were compared with those in the full dark matter
particle distribution smoothed on the same scale. Their total
incompleteness and contamination rates were both ~20 per cent.



the evolved dark matter distribution, the latter showed
alignment between the axis of structure and these filamentary
minima in $\lambda_1$. The filament-finding algorithm
enables us to follow the axis of structure and trace out
individual large-scale structures in each distribution.

Using the $z = 0$ linear power spectrum (Spergel et al.
2007) with corrections in the non-linear regime
(Smith et al. 2003; the same one used in the N-body
simulations discussed here), we generate a continuous realization
of a three-dimensional Gaussian random field.
We use $l = 5$ $h^{-1}$ Mpc, $C = 30^\circ\,l^{-1}$, and $K = 1$ to derive
the filament distribution shown in the upper right
panel of Fig. 6 and compare it with identical runs on
the $z = 0$ dark matter distribution in the cosmological
simulation (upper left panel). The qualitative differences
between the two are substantial. While the output for
the dark matter distribution resembles a noded network,
with filaments converging and ending at vertices in the
network, the 'filaments' in the Gaussian random field appear
more randomly oriented and show no apparent correlations
with one another. Using 5 $h^{-1}$ Mpc smoothing,
the filament length distributions for the Gaussian random
field and dark matter distribution are shown in the
centre panel of Fig. 6. The distributions are very similar
and clearly exponential above a length of ~10 $h^{-1}$ Mpc,
with $N(L) \propto 10^{-0.1L}$, suggesting that filaments have not
collapsed much along their longest axis since their formation,
but have changed their alignment in relation to
nearby structures.

We will define the width of a filament element, W, to 
be the root mean squared perpendicular offset of particles 
Fig. 6. Distribution of filaments in the dark matter distribution
(no sparse sampling, upper left panel) and a Gaussian random field
(upper right) with the ΛCDM $z = 0$ non-linear power spectrum.
The filaments are found on $l = 5$ $h^{-1}$ Mpc scales and we only plot
subsections of the full 200 $h^{-1}$ Mpc boxes. In the centre and bottom
panels, we show the filament length and width distributions,
respectively, for the dark matter distribution (solid line) and Gaussian
random field (dashed line). In both cases, the filament-finding
algorithm was run with $C = 30^\circ\,l^{-1}$ and $K = 1$.

within a smoothing length; that is,

$$W = \sqrt{\frac{1}{N}\sum_{i=1}^{N} |\mathbf{R}_i|^2}, \tag{7}$$

where $\mathbf{R}_i$ is defined in Equation 5 and the sum is over
all of the N particles within one smoothing length of
the filament element. In the bottom panel of Fig. 6,
we plot the width distributions for the two fields, again
using $l = 5$ $h^{-1}$ Mpc. The dark matter width distributions
are broader and are peaked at smaller widths,
suggesting that the filaments have collapsed significantly
along two of their principal axes, despite having a similar
length distribution. As one would expect with bottom-up
structure formation, the width distributions in the Gaussian
random field and dark matter distribution are more
discrepant at smaller smoothing scales (other scales not
shown).
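
The width measurement can be sketched as follows. This assumes the width is the plain root mean square of the perpendicular offsets, as the text describes, and uses the identity $|\mathbf{R}_i|^2 = |\mathbf{d}|^2 - (\mathbf{d}\cdot\mathbf{A})^2$ for a unit axis $\mathbf{A}$ and separation $\mathbf{d}$. The function name and sample points are hypothetical.

```python
import math


def filament_width(axis, x_elem, points, smoothing_length):
    """RMS perpendicular offset of the points within one smoothing length.

    axis is a unit vector along the filament; positions are 3-tuples.
    """
    squared_offsets = []
    for p in points:
        if math.dist(p, x_elem) > smoothing_length:
            continue  # only particles within a smoothing length contribute
        d = [a - b for a, b in zip(p, x_elem)]
        along = sum(x * y for x, y in zip(d, axis))
        perp_sq = sum(x * x for x in d) - along * along
        squared_offsets.append(max(perp_sq, 0.0))
    if not squared_offsets:
        raise ValueError("no points within a smoothing length")
    return math.sqrt(sum(squared_offsets) / len(squared_offsets))


# Hypothetical element at the origin with its axis along z.
points = [(3.0, 0.0, 1.0), (0.0, 4.0, -2.0), (10.0, 0.0, 0.0)]
width = filament_width((0.0, 0.0, 1.0), (0.0, 0.0, 0.0), points, 5.0)
print(width)
```

The third point lies outside the smoothing length and is ignored, so the width is set by perpendicular offsets of 3 and 4 only.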

4.1. Filament evolution 

In Paper 1, we showed that on a given comoving
smoothing scale, there was evidence for a wall-to-filament-to-clump
evolution with cosmic time. Furthermore,
we showed that the axis of structure aligns with
the filamentary backbone in two-dimensional slices from
cosmological simulations as early as $z = 3$ (see figure 14
in Paper 1). Fig. 7 shows the filament distribution at
$z = 0$ and $z = 3$, now with $l = 15$ $h^{-1}$ Mpc so as to test
the largest and least-evolved structures in the simulation
box. We used a smaller removal width, $K = 0.6$, for
the $z = 3$ filament distribution because the filaments are
of lower contrast than at $z = 0$, causing Equation 3 to
overestimate their sizes. The $z = 3$ and $z = 0$ filament
distributions are very similar to the eye, suggesting that



Fig. 7. Dark matter filament distributions at $z = 3$ and $z = 0$
after $l = 15$ $h^{-1}$ Mpc smoothing. The former was found with a
smaller value of K because the algorithm tends to overestimate
filament widths when the filaments are of low contrast (see text).
Both the number and spatial distribution of filaments appear to be
similar at the two redshifts, but the width distributions are not, as
shown in the right panel. Here, the filament width distributions at
$z = 3$ (dashed line), $z = 1$ (dotted line), and $z = 0$ (solid line) are
plotted.

the basic filament framework for $l = 15$ $h^{-1}$ Mpc is almost
entirely in place at $z = 3$ (where 15 $h^{-1}$ Mpc fluctuations
have $\langle(\Delta M/M)^2\rangle^{1/2} \sim 0.1$). The right-hand
panel of Fig. 7 shows the filament element width distributions
as a function of redshift. As non-linear evolution
proceeds, the filament width distributions broaden and
peak at smaller widths.

5. FILAMENTS IN THE MOCK GALAXY CATALOGUES 

Before we proceed to identify filaments in the SDSS
data, we run the filament finder on the mock galaxy
samples in redshift space (see Paper 1) and compare the
resulting filaments to those identified in the real-space
$z = 0$ dark matter distribution. The $l = 5$ $h^{-1}$ Mpc filament
distribution is very strongly affected by redshift
distortions: the contamination rates are typically ~40 per cent,
about double the contamination of the
filament samples without redshift distortions. This is
due primarily to the 'finger-of-god' effect, which causes
galaxy clusters to extend into narrow, sharp filament-like
features along the line of sight. Fortunately, the
filament finder is insensitive to these distortions on
10 $h^{-1}$ Mpc and 15 $h^{-1}$ Mpc scales because the fingers-of-god
are typically ~1 $h^{-1}$ Mpc in width. Nevertheless,
we can improve our results if we first remove the fingers-of-god.

5.1. Identification and removal of fingers-of-god 

Fingers-of-god from galaxy clusters are extended along
the observer's line of sight, while real filamentary structures
have no preferred direction. In order to separate
the fingers-of-god from the real filaments, we will use a



Fig. 8. Two volume-limited samples taken from the SDSS
VAGC large-scale structure sample, with galaxies placed at their
comoving positions based on the concordance cosmology. The
arrows indicate the location of the Milky Way, which is r =
(310, -20, 170) $h^{-1}$ Mpc and r = (400, -25, 200) $h^{-1}$ Mpc in the
Mr205 and Mr21 samples, respectively. The z axes are parallel to
the galactic north pole.

friends-of-friends algorithm with two linking lengths,



= ^\h\^-bl (8) 

where f is the unit vecto r along the observer's line of sight 
(jHuchra fc Gellerill982j ). With these two parameters de- 
fined, the algorithm searches for cylindrical str uctures 
with a diameter-to-length ratio of B erlind et alj 

(e.g. I2006L hereafter B06) did an exhaustive study of 
this two-parameter space and found that b± = 0.14 and 
6 II — 0.75 gave unbiased estimates of the group multi- 
plicity function, so we adopt these values in our study. 
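
As a rough illustration of grouping with two linking lengths, the sketch below links a pair of galaxies when its separation along the line of sight and its separation across the line of sight are both within the corresponding linking length. It is not the B06 code. The function name `fof_groups`, the observer at the origin, the use of the pair midpoint to define the line of sight, and the expression of the linking lengths in units of a mean inter-galaxy separation `mean_sep` are all assumptions made for this example. The pair loop is quadratic in the number of galaxies and is meant for small test cases only.

```python
import math


def fof_groups(positions, b_perp, b_par, mean_sep):
    """Group galaxies with a two-linking-length friends-of-friends pass.

    positions     : list of (x, y, z) tuples; observer assumed at the origin
    b_perp, b_par : linking lengths across / along the line of sight,
                    in units of mean_sep (an assumed convention)
    Returns a list of groups, each a list of galaxy indices.
    """
    n = len(positions)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            pi, pj = positions[i], positions[j]
            sep = [pj[k] - pi[k] for k in range(3)]
            mid = [0.5 * (pi[k] + pj[k]) for k in range(3)]
            norm = math.sqrt(sum(c * c for c in mid))
            if norm == 0.0:
                continue  # line of sight is undefined at the observer
            los = [c / norm for c in mid]
            # Separation along and across the line of sight.
            d_par = abs(sum(sep[k] * los[k] for k in range(3)))
            d_perp = math.sqrt(max(sum(c * c for c in sep) - d_par ** 2, 0.0))
            if d_par <= b_par * mean_sep and d_perp <= b_perp * mean_sep:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

With $b_\perp = 0.14$ and $b_\parallel = 0.75$, three galaxies spaced by half a mean separation along the line of sight are joined into one group, while a fourth galaxy offset by the same distance across the line of sight is left on its own.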

5.2. Filaments after cluster collapse 

All of the tests in this section were performed on the
samples of mock galaxies with density similar to that
of $M_r < -20$ galaxies. First, we removed galaxy clusters
from the real-space mock galaxy distribution using
an isotropic friends-of-friends algorithm with
$b_\parallel = b_\perp = 0.2$ and a minimum group size of
$N_{\rm min} = 5$. For $l = 10\,h^{-1}$ Mpc, the filament
incompleteness and contamination rates for the cluster-free
filament distribution (33 and 39 per cent, respectively) are
much larger than those for the full real-space distribution
(16 and 25 per cent), suggesting that overdensities on
megaparsec scales are playing an important role in defining
filaments on $10\,h^{-1}$ Mpc scales. Similar results are
obtained when clusters are found and removed in redshift
space using the approach of § 5.1.

If we instead collapse the fingers-of-god presented by 
galaxy clusters, we can remove most of the contamina- 
tion without having to remove the clusters themselves. 
For this study, we will take the very simple approach
of moving all members of a particular cluster to their
mean position; that is, we will collapse each finger-of-god
to a single point weighted by the number of galaxies in the



Fig. 9. Filaments found in the Mr205 (left, $l = 10\,h^{-1}$ Mpc)
and Mr21 (right, $l = 15\,h^{-1}$ Mpc) SDSS samples. These are the
full filament samples after the filament finder was run with the
'best' parameters. Note that the boxes are different (overlapping)
volumes of space and thus are not directly comparable to one
another.

cluster. If we follow this procedure, the incompleteness
and contamination (19 and 26 per cent) are smaller than in
the cluster-free mock galaxy distributions and marginally
better than in the redshift-space distribution with no
special treatment of clusters (20 and 27 per cent).
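
The collapse step can be sketched in a few lines. The example below is a minimal illustration, not the code used for the results quoted here: `collapse_groups` is a hypothetical helper, the group membership lists are assumed to come from a prior friends-of-friends pass, and the minimum group size `n_min` applied to the collapse is an assumption. Every member keeps its own entry in the output, so the collapsed point carries the weight of all the galaxies in the cluster.

```python
def collapse_groups(positions, groups, n_min=5):
    """Move every member of a sufficiently rich group to the group mean.

    positions : list of (x, y, z) tuples
    groups    : list of lists of indices into positions
    n_min     : smallest group that is collapsed (assumed value)
    Returns a new list of positions; the input is left unchanged.
    """
    collapsed = [tuple(p) for p in positions]
    for members in groups:
        if len(members) < n_min:
            continue
        # Unweighted mean position of the group members.
        centre = tuple(
            sum(positions[i][k] for i in members) / len(members)
            for k in range(3)
        )
        for i in members:
            collapsed[i] = centre
    return collapsed
```

Because the galaxies are moved rather than removed, the smoothed density field still sees the full mass of each cluster, which is the property that distinguishes this treatment from the cluster-free samples.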

We repeated this exercise for filaments found on a
$5\,h^{-1}$ Mpc smoothing scale. Collapsing the fingers-of-god
does lead to a marginal improvement, but filaments
are still very poorly defined in redshift space at these
densities, with $\sim 40$ per cent contamination rates. A more
sophisticated treatment of the clusters may be needed,
but is beyond the scope of this paper. In the section
that follows we will discuss the application of the filament
finder to real SDSS data. To minimize contamination,
we will be working only with filaments found
on $10\,h^{-1}$ Mpc and $15\,h^{-1}$ Mpc scales and only after
collapsing fingers-of-god.

6. FILAMENTS IN THE SDSS GALAXY DISTRIBUTION 

The Sloan Digital Sky Survey has imaged a quarter
of the sky in five wavebands, ranging from 3000
to 10000 Å, to a depth of $r \sim 22.5$ (York et al. 2000).
As of Data Release 6 (Adelman-McCarthy et al. 2008),
spectra had been taken of $\sim 800,000$ galaxies, covering
9583 square degrees and extending to Petrosian $r \sim 17.7$
(Strauss et al. 2002). Galaxy redshifts are typically
accurate to $\sim 30$ km s$^{-1}$, making the survey ideal for
studies of large-scale structure. For this study, we need a
portion of sky with relatively few coverage gaps to minimize
the effect of the window function on the $\lambda$-space
distributions. With this in mind, we construct two
volume-limited subsamples from the northern portion
($8 < \alpha < 16$ h and $25° < \delta < 60°$) of the NYU
Value-Added Galaxy Catalog (NYU-VAGC, Blanton et al. 2005,
through DR6), the first $140 \times 140 \times 340\ (h^{-1}\,{\rm Mpc})^3$
in size with $M_r < -20.5$ (Mr205) and the second
$170 \times 170 \times 400\ (h^{-1}\,{\rm Mpc})^3$ in size with
$M_r < -21$ (Mr21). The samples extend to maximum redshifts of
$z = 0.12$ and $z = 0.15$, respectively, and are plotted in
redshift space in Fig. 8. Absolute magnitudes were computed
with kcorrect (Blanton et al. 2003) using SDSS r-band Petrosian
magnitudes shifted to $z = 0.1$ (and using $h = 1$).

We described the compilation and processing of the
SDSS subsamples and their mock counterparts in Paper 1.
Before generating the filament distributions, we
identify and collapse the fingers-of-god as described in
the last section. After performing this procedure on the
Mr205 and Mr21 samples (both real and mock), we
smooth the former with $l = 10\,h^{-1}$ Mpc and the latter
with $l = 15\,h^{-1}$ Mpc. These choices maximize the volume
covered while keeping the sampling rate high enough
that filament contamination is under $\sim 25$ per cent
(see § 3).

We run the filament finder on the Mr205 and Mr21
galaxy samples using $C = 40°$ and $C = 50°$,
respectively, and $K = 1$. The resulting filaments are
shown in Fig. 9. After removing filaments shorter than
a smoothing length, the algorithm finds 489 filaments in
Mr205, having a total length per unit volume of
$1.9 \times 10^{-3}\,h^{2}\,{\rm Mpc}^{-2}$ ($l = 10\,h^{-1}$ Mpc),
while in Mr21, 226 filaments are found with a total length
per unit volume of $7.6 \times 10^{-4}\,h^{2}\,{\rm Mpc}^{-2}$
($l = 15\,h^{-1}$ Mpc). For comparison, the mock Mr205 catalogue
contains 451 filaments with a total length per unit volume of
$1.7 \times 10^{-3}\,h^{2}\,{\rm Mpc}^{-2}$ ($l = 10\,h^{-1}$ Mpc)
and the mock Mr21 catalogue contains 235 filaments with a total
length per unit volume of $8.2 \times 10^{-4}\,h^{2}\,{\rm Mpc}^{-2}$
($l = 15\,h^{-1}$ Mpc). Thus, the number
density of filaments in the simulations closely matches
that in the real universe.
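
The filament counts and length densities can be combined into a rough mean filament length, which is a useful sanity check on the quoted numbers. The sketch below assumes that the length densities are $1.9 \times 10^{-3}$ and $7.6 \times 10^{-4}\,h^{2}\,{\rm Mpc}^{-2}$ and that they are normalized by the full rectangular sample volumes ($140 \times 140 \times 340$ and $170 \times 170 \times 400\ (h^{-1}\,{\rm Mpc})^3$); if the survey does not fill those boxes, the totals would be smaller. The helper name `filament_length_summary` is hypothetical.

```python
def filament_length_summary(length_density, box_dims, n_filaments):
    """Return (total length, mean length per filament) in h^-1 Mpc.

    length_density : filament length per unit volume, h^2 Mpc^-2
    box_dims       : (x, y, z) box side lengths, h^-1 Mpc (assumed volume)
    n_filaments    : number of filaments found in the sample
    """
    volume = box_dims[0] * box_dims[1] * box_dims[2]
    total = length_density * volume
    return total, total / n_filaments


# SDSS samples, using the values quoted in the text.
total_205, mean_205 = filament_length_summary(1.9e-3, (140, 140, 340), 489)
total_21, mean_21 = filament_length_summary(7.6e-4, (170, 170, 400), 226)
print(round(mean_205, 1), round(mean_21, 1))  # 25.9 38.9
```

Under these assumptions the mean filament length is roughly 2.6 smoothing lengths in both samples, which is consistent with filaments shorter than one smoothing length having been removed.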

We found in § 4 that, above two smoothing lengths,
dark matter filaments had an exponential length distribution
that very closely matched that found in a Gaussian
random field with the same power spectrum. This
suggests that, even if the filaments in the data are in
a different stage of their evolution (i.e., having different
$\sigma_8$) than those in the simulations, the length
distributions should be the same between the two. This does
appear to be the case, as shown in Fig. 10.
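
One simple way to characterize an approximately exponential length distribution above a cutoff is its maximum-likelihood scale length. For $p(L) \propto \exp(-L/L_0)$ with $L \geq L_{\rm min}$, the estimate is $L_0 = \langle L \rangle - L_{\rm min}$. The sketch below illustrates this on synthetic lengths; it is not the fitting procedure used for Fig. 10, and the function name, the cutoff of 20, and the true scale of 15 are arbitrary choices made for the example.

```python
import random


def exponential_scale_mle(lengths, l_min):
    """Maximum-likelihood scale of an exponential tail above l_min.

    Lengths below l_min are ignored; the estimate is the mean excess
    of the remaining lengths over the cutoff.
    """
    tail = [length for length in lengths if length >= l_min]
    if not tail:
        raise ValueError("no lengths above the cutoff")
    return sum(tail) / len(tail) - l_min


# Synthetic sample: exponential tail with scale 15 above a cutoff of 20.
random.seed(0)
sample = [20.0 + random.expovariate(1.0 / 15.0) for _ in range(20000)]
print(exponential_scale_mle(sample, 20.0))  # close to 15
```

Applying the same estimator to the data and to the mock catalogues above a common cutoff, such as two smoothing lengths, would give a single number per sample to compare.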

More interesting is the similarity of the width
distributions of filament elements, shown in Fig. 11. In the
SDSS, we find mean filament widths of $5.5\,h^{-1}$ Mpc and
$8.4\,h^{-1}$ Mpc on $10\,h^{-1}$ Mpc and $15\,h^{-1}$ Mpc
smoothing scales, with standard deviations of $1.1\,h^{-1}$ Mpc
and $1.4\,h^{-1}$ Mpc, respectively. As was demonstrated
in Fig. 7, filament element width distributions broaden
and shift to smaller widths as non-linear evolution
proceeds. A large discrepancy in, for example, $\sigma_8$ between
the simulations and real data should produce filament
populations that are at different stages of non-linear
evolution and have different width distributions. As such,
Fig. 11 suggests that the SDSS filaments are consistent both
with the standard model and with the
set of cosmological parameters used in the simulation.
7. RESULTS AND DISCUSSION 

This paper develops and uses an algorithm called the
Smoothed Hessian Major Axis Filament Finder (SHMAFF) to
identify individual filaments in large-scale structure. In
short, it uses the local eigenvectors of the density
second-derivative field to define the filament axis and trace
individual filaments. Filament ends are defined as points at which the




[Axis label: L ($h^{-1}$ Mpc)]

Fig. 10. Length distributions of the Mr205 (bottom) and Mr21
(top) filament samples, with SDSS filaments plotted with a solid
line and redshift-space mock catalogues with a dashed line.




[Axis label: W ($h^{-1}$ Mpc)]

Fig. 11. Width distributions of the Mr205 (bottom) and Mr21
(top) filament samples, with SDSS filaments plotted with a solid
line and redshift-space mock catalogues with a dashed line. The
SDSS filaments in both samples are consistent with those in the
cosmological simulations, suggesting that they are in similar stages
of non-linear evolution.

rate of change of the axis of structure exceeds a specified
threshold (see § 2). In a $\Lambda$CDM cosmological simulation,
this definition produces filament samples that are
consistent with our visual impression of structure on a
particular scale, are complete with few duplicate detections
(§ 3.2), and are robust to sparse sampling (§ 3.4).
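
The end-of-filament criterion summarized above can be sketched as follows. This is a simplified illustration rather than the SHMAFF implementation: it assumes the filament axis has already been sampled as a sequence of unit vectors at equally spaced steps, that the step length is expressed in units of the smoothing length, and that the threshold is an angle per smoothing length. Both function names are hypothetical. Because an eigenvector defines an axis rather than a direction, a vector and its negative are treated as identical.

```python
import math


def axis_turn_rate(axis_a, axis_b, step):
    """Angle in degrees between two unit axis vectors, per unit step.

    Axes are headless, so v and -v count as the same orientation.
    """
    dot = abs(sum(a * b for a, b in zip(axis_a, axis_b)))
    angle = math.degrees(math.acos(min(dot, 1.0)))
    return angle / step


def steps_before_stop(axes, step, c_max):
    """Number of steps traced before the turn rate first exceeds c_max."""
    n_steps = 0
    for previous, current in zip(axes, axes[1:]):
        if axis_turn_rate(previous, current, step) > c_max:
            break
        n_steps += 1
    return n_steps
```

For example, with a step of half a smoothing length and a threshold of 40 degrees per smoothing length, an axis that swings by 10 degrees per step is followed, while a swing of 70 degrees in a single step ends the filament.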

In addition to the smoothing scale, the filament finder
takes the input parameters $C$, the maximum angular rate
of change of the filament axis, and $K$, the width of
filament removal in units of the smoothing length. Using
Gaussian smoothing, the 'best' values of these input
parameters are $C = 30$, 40, and 50° on 5, 10, and
$15\,h^{-1}$ Mpc smoothing scales, respectively, and $K = 1$
for all smoothing scales. After we collapse the fingers-of-god,
contamination and completeness in filament samples
found in the mock $M_r < -20.5$ galaxy distribution
are 26 and $\sim 81$ per cent, respectively. Galaxy
clusters are important for defining large-scale filaments
and should not be removed before running the filament
finder. In redshift space and on smoothing scales above
$\sim 10\,h^{-1}$ Mpc, collapsing fingers-of-god to their mean
position produces mock filament samples comparable to
those in real space.

In this paper, we presented two volume-limited subsamples
from the northern portion of the SDSS spectroscopic
survey (using the NYU-VAGC catalogue)
and computed their filament distributions on 10 and
$15\,h^{-1}$ Mpc smoothing scales. These distributions were
then directly compared to those found in a series of
redshift-space mock galaxy catalogues generated from a
cosmological simulation using the concordance cosmology.
The filament length distributions found in SDSS
data are very similar to those found in mock catalogues
and are consistent with being drawn from an underlying
exponential distribution. The width distributions of
filament elements are also very similar between the SDSS
data and mock catalogues, suggesting that real filaments
are consistent with those in a $\Lambda$CDM universe having
$\sigma_8 = 0.85$, $\Omega_\Lambda = 0.71$, $\Omega_m = 0.29$,
and $h = 0.69$. Tests
on a range of cosmological simulations are needed before
this can be turned into a cosmological constraint.

We also generated filament distributions at six redshifts
in the output of a $\Lambda$CDM cosmological N-body
simulation, from $z = 3$ to $z = 0$. The orientation of
the filament network is stable out to $z = 3$ on comoving
smoothing scales at least as large as $15\,h^{-1}$ Mpc.
Most of the filaments detected on $15\,h^{-1}$ Mpc scales at
$z = 0$ can be detected at $z = 3$. In addition, on a given
comoving smoothing scale, filament width distributions
shift to smaller widths as the filaments continue to
collapse. Narrower filaments will collapse more rapidly, so
this also leads to a broadening of the width distributions.

We have demonstrated that our filament finder is able
to locate and follow real structures, perhaps most strikingly
in § 4.1, in which we showed that many of the same
structures could be seen in a cosmological simulation at
both $z = 3$ and $z = 0$. There is some subjective freedom
in deciding what constitutes the end of a filament,
as no single physical threshold stands out as a
discriminator. Nevertheless, we demonstrated in § 3.2 that the
total length of the cosmic network is insensitive to the
choice of $C$ above a certain scale-dependent threshold
(once double detections are removed). The minimum
value of $C$ needed to probe the entire filament network
may be telling us about the intrinsic clumpiness of
filamentary structure, and may therefore be able to
distinguish models of warm and cold dark matter.

In this paper, we fully developed the SHMAFF algorithm
and applied it to the low-redshift galaxy distribution,
but there is much that can still be learned from
its application to redshift surveys. The filament evolution
seen in cosmological simulations (see § 4.1) can be
tested in the DEEP2 galaxy survey (Davis et al. 2003)
at $z \sim 1$, and the results of this comparison have already
been presented in Choi et al. (2010). On $l = 5\,h^{-1}$ Mpc
and $l = 10\,h^{-1}$ Mpc scales, they confirm a shift in the
filament width distribution to smaller widths from $z \sim 0.8$
to $z \sim 0.1$, as well as a broadening of the filament width
distribution. A possible extension of this work is a careful
test of the $\Lambda$CDM cosmological model, including
precision constraints on cosmological parameters, such as
$\sigma_8$, and tests for primordial non-Gaussianity using the
length distribution of filamentary structures. In addition,
it would be useful to elaborate on the relationship
of large-scale filaments to galaxy clusters and to explore
the properties of galaxies in filaments relative to the
general galaxy population. Finally, it would be interesting to
conduct a careful search for walls in SDSS. Paper 1 hinted
at their presence in the data, but they were only present
at low contrast and the $\lambda$-space distributions were not
optimal for identifying individual wall-like structures.